Overview

Dataset statistics

Number of variables: 31
Number of observations: 71236
Missing cells: 144993
Missing cells (%): 6.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.8 MiB
Average record size in memory: 248.0 B
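
The missing-cell percentage can be reproduced from the figures above. A minimal sketch, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 31
n_observations = 71236
missing_cells = 144993

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```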

Variable types

Numeric: 11
Categorical: 13
Text: 5
Boolean: 2

Alerts

country has constant value "USA" [Constant]
weight is highly overall correlated with glucose_test_result [High correlation]
glucose_test_result is highly overall correlated with weight [High correlation]
change_in_meds_during_hospitalization is highly overall correlated with prescribed_diabetes_meds [High correlation]
prescribed_diabetes_meds is highly overall correlated with change_in_meds_during_hospitalization [High correlation]
readmitted_binary is highly overall correlated with readmitted_multiclass [High correlation]
readmitted_multiclass is highly overall correlated with readmitted_binary [High correlation]
race is highly imbalanced (56.0%) [Imbalance]
weight is highly imbalanced (92.0%) [Imbalance]
discharge_disposition is highly imbalanced (58.4%) [Imbalance]
admission_source is highly imbalanced (64.0%) [Imbalance]
race has 3554 (5.0%) missing values [Missing]
age has 3557 (5.0%) missing values [Missing]
admission_type has 3706 (5.2%) missing values [Missing]
discharge_disposition has 2590 (3.6%) missing values [Missing]
admission_source has 4718 (6.6%) missing values [Missing]
glucose_test_result has 67548 (94.8%) missing values [Missing]
a1c_test_result has 59320 (83.3%) missing values [Missing]
emergency_visits_in_previous_year is highly skewed (γ1 = 22.47573232) [Skewed]
encounter_id has unique values [Unique]
outpatient_visits_in_previous_year has 59587 (83.6%) zeros [Zeros]
emergency_visits_in_previous_year has 63242 (88.8%) zeros [Zeros]
inpatient_visits_in_previous_year has 47231 (66.3%) zeros [Zeros]
non_lab_procedures has 32632 (45.8%) zeros [Zeros]
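
As a cross-check, the per-column counts in the Missing alerts can be summed and compared with the overall missing-cell count. The sketch below copies the counts from the alerts; the dictionary name is illustrative.

```python
# Missing-value counts copied from the "Missing" alerts above.
missing_by_column = {
    "race": 3554,
    "age": 3557,
    "admission_type": 3706,
    "discharge_disposition": 2590,
    "admission_source": 4718,
    "glucose_test_result": 67548,
    "a1c_test_result": 59320,
}
n_observations = 71236

total_missing = sum(missing_by_column.values())

# Share of rows missing per column, rounded to one decimal as in the report.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_column.items()
}

print(total_missing)
for name, pct in missing_pct.items():
    print(f"{name}: {pct}%")
```

The total comes to 144993, the same as the Missing cells figure in the dataset statistics, which suggests these seven columns account for every missing cell.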

Reproduction

Analysis started: 2023-11-07 10:34:19.418091
Analysis finished: 2023-11-07 10:34:46.225470
Duration: 26.81 seconds
Software version: ydata-profiling v4.6.1
Download configuration: config.json
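
The reported duration follows from the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-11-07 10:34:19.418091")
finished = datetime.fromisoformat("2023-11-07 10:34:46.225470")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```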

Variables

encounter_id
Real number (ℝ)

UNIQUE 

Distinct: 71236
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 551634.26
Minimum: 100011
Maximum: 999979
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 100011
5-th percentile: 146758
Q1: 326583
median: 552684
Q3: 775671
95-th percentile: 955575.5
Maximum: 999979
Range: 899968
Interquartile range (IQR): 449088

Descriptive statistics

Standard deviation: 259387
Coefficient of variation (CV): 0.47021555
Kurtosis: -1.1977773
Mean: 551634.26
Median Absolute Deviation (MAD): 224356.5
Skewness: -0.0066514534
Sum: 3.9296218 × 10^10
Variance: 6.7281618 × 10^10
Monotonicity: Not monotonic
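
Several of the figures above are derived from others in the same block. A short sketch of those relationships, using the standard definitions of range, interquartile range, and coefficient of variation; the variable names are illustrative:

```python
# Values copied from the encounter_id statistics above.
minimum, maximum = 100011, 999979
q1, q3 = 326583, 775671
std_dev, mean = 259387, 551634.26

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std_dev / mean               # standard deviation relative to the mean

print(value_range, iqr, round(cv, 4))
```

Because the standard deviation is shown rounded, the recomputed CV agrees with the reported value only to about seven decimal places.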
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
533192 1
 
< 0.1%
918595 1
 
< 0.1%
578998 1
 
< 0.1%
224573 1
 
< 0.1%
542068 1
 
< 0.1%
985728 1
 
< 0.1%
315190 1
 
< 0.1%
774724 1
 
< 0.1%
815392 1
 
< 0.1%
804149 1
 
< 0.1%
Other values (71226) 71226
> 99.9%
ValueCountFrequency (%)
100011 1
< 0.1%
100014 1
< 0.1%
100020 1
< 0.1%
100030 1
< 0.1%
100047 1
< 0.1%
100049 1
< 0.1%
100063 1
< 0.1%
100068 1
< 0.1%
100077 1
< 0.1%
100113 1
< 0.1%
ValueCountFrequency (%)
999979 1
< 0.1%
999968 1
< 0.1%
999966 1
< 0.1%
999963 1
< 0.1%
999951 1
< 0.1%
999949 1
< 0.1%
999942 1
< 0.1%
999941 1
< 0.1%
999940 1
< 0.1%
999932 1
< 0.1%

country
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB
USA | 71236

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 213708
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USA
2nd row: USA
3rd row: USA
4th row: USA
5th row: USA

Common Values

ValueCountFrequency (%)
USA 71236
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
usa 71236
100.0%

Most occurring characters

ValueCountFrequency (%)
U 71236
33.3%
S 71236
33.3%
A 71236
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 213708
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 71236
33.3%
S 71236
33.3%
A 71236
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 213708
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 71236
33.3%
S 71236
33.3%
A 71236
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 213708
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 71236
33.3%
S 71236
33.3%
A 71236
33.3%

patient_id
Real number (ℝ)

Distinct: 53985
Distinct (%): 75.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54302279
Minimum: 135
Maximum: 1.8950262 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 135
5-th percentile: 1422681.8
Q1: 23396510
median: 45305631
Q3: 87558374
95-th percentile: 1.1153412 × 10^8
Maximum: 1.8950262 × 10^8
Range: 1.8950248 × 10^8
Interquartile range (IQR): 64161864

Descriptive statistics

Standard deviation: 38795850
Coefficient of variation (CV): 0.71444239
Kurtosis: -0.33336755
Mean: 54302279
Median Absolute Deviation (MAD): 32995930
Skewness: 0.47990669
Sum: 3.8682772 × 10^12
Variance: 1.505118 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
88785891 33
 
< 0.1%
1660293 19
 
< 0.1%
84428613 18
 
< 0.1%
37096866 17
 
< 0.1%
23199021 15
 
< 0.1%
88227540 15
 
< 0.1%
90609804 15
 
< 0.1%
89472402 15
 
< 0.1%
88681950 15
 
< 0.1%
97391007 15
 
< 0.1%
Other values (53975) 71059
99.8%
ValueCountFrequency (%)
135 2
 
< 0.1%
729 1
 
< 0.1%
774 1
 
< 0.1%
1152 5
< 0.1%
1314 3
< 0.1%
1629 1
 
< 0.1%
5220 2
 
< 0.1%
5337 1
 
< 0.1%
6174 1
 
< 0.1%
6309 1
 
< 0.1%
ValueCountFrequency (%)
189502619 1
< 0.1%
189445127 1
< 0.1%
189365864 1
< 0.1%
189351095 1
< 0.1%
189349430 1
< 0.1%
189298877 1
< 0.1%
189257846 2
< 0.1%
189215762 1
< 0.1%
189195422 1
< 0.1%
189179321 1
< 0.1%

race
Categorical

IMBALANCE  MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 3554
Missing (%): 5.0%
Memory size: 556.7 KiB
Caucasian | 50693
AfricanAmerican | 12693
? | 1516
Hispanic | 1364
Other | 995

Length

Max length: 15
Median length: 9
Mean length: 9.8422032
Min length: 1

Characters and Unicode

Total characters: 666140
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Caucasian
2nd row: AfricanAmerican
3rd row: Caucasian
4th row: AfricanAmerican
5th row: Caucasian

Common Values

Value | Count | Frequency (%)
Caucasian | 50693 | 71.2%
AfricanAmerican | 12693 | 17.8%
? | 1516 | 2.1%
Hispanic | 1364 | 1.9%
Other | 995 | 1.4%
Asian | 421 | 0.6%
(Missing) | 3554 | 5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
caucasian 50693
74.9%
africanamerican 12693
 
18.8%
1516
 
2.2%
hispanic 1364
 
2.0%
other 995
 
1.5%
asian 421
 
0.6%
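
The percentages in this table differ from those under Common Values (for example 74.9% versus 71.2% for Caucasian). The figures are consistent with the plot table using only non-missing rows as its denominator; this is an inference from the numbers, not something the report states. A quick check, with illustrative variable names:

```python
# Counts copied from the race section above.
caucasian = 50693
n_observations = 71236
n_missing = 3554

# Denominator 1: every row, including rows where race is missing.
pct_of_all_rows = 100 * caucasian / n_observations

# Denominator 2 (assumed): only rows where race is present.
pct_of_non_missing = 100 * caucasian / (n_observations - n_missing)

print(f"{pct_of_all_rows:.1f}% of all rows, {pct_of_non_missing:.1f}% of non-missing rows")
```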

Most occurring characters

ValueCountFrequency (%)
a 179250
26.9%
i 79228
11.9%
n 77864
11.7%
c 77443
11.6%
s 52478
 
7.9%
C 50693
 
7.6%
u 50693
 
7.6%
r 26381
 
4.0%
A 25807
 
3.9%
e 13688
 
2.1%
Other values (8) 32615
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 585765
87.9%
Uppercase Letter 78859
 
11.8%
Other Punctuation 1516
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 179250
30.6%
i 79228
13.5%
n 77864
13.3%
c 77443
13.2%
s 52478
 
9.0%
u 50693
 
8.7%
r 26381
 
4.5%
e 13688
 
2.3%
f 12693
 
2.2%
m 12693
 
2.2%
Other values (3) 3354
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
C 50693
64.3%
A 25807
32.7%
H 1364
 
1.7%
O 995
 
1.3%
Other Punctuation
ValueCountFrequency (%)
? 1516
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 664624
99.8%
Common 1516
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 179250
27.0%
i 79228
11.9%
n 77864
11.7%
c 77443
11.7%
s 52478
 
7.9%
C 50693
 
7.6%
u 50693
 
7.6%
r 26381
 
4.0%
A 25807
 
3.9%
e 13688
 
2.1%
Other values (7) 31099
 
4.7%
Common
ValueCountFrequency (%)
? 1516
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 666140
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 179250
26.9%
i 79228
11.9%
n 77864
11.7%
c 77443
11.6%
s 52478
 
7.9%
C 50693
 
7.6%
u 50693
 
7.6%
r 26381
 
4.0%
A 25807
 
3.9%
e 13688
 
2.1%
Other values (8) 32615
 
4.9%

gender
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB
Female | 38228
Male | 33005
Unknown/Invalid | 3

Length

Max length: 15
Median length: 6
Mean length: 5.0737408
Min length: 4

Characters and Unicode

Total characters: 361433
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Male
3rd row: Female
4th row: Male
5th row: Female

Common Values

Value | Count | Frequency (%)
Female | 38228 | 53.7%
Male | 33005 | 46.3%
Unknown/Invalid | 3 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
female 38228
53.7%
male 33005
46.3%
unknown/invalid 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 109461
30.3%
a 71236
19.7%
l 71236
19.7%
F 38228
 
10.6%
m 38228
 
10.6%
M 33005
 
9.1%
n 12
 
< 0.1%
U 3
 
< 0.1%
k 3
 
< 0.1%
o 3
 
< 0.1%
Other values (6) 18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 290191
80.3%
Uppercase Letter 71239
 
19.7%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 109461
37.7%
a 71236
24.5%
l 71236
24.5%
m 38228
 
13.2%
n 12
 
< 0.1%
k 3
 
< 0.1%
o 3
 
< 0.1%
w 3
 
< 0.1%
v 3
 
< 0.1%
i 3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
F 38228
53.7%
M 33005
46.3%
U 3
 
< 0.1%
I 3
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 361430
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 109461
30.3%
a 71236
19.7%
l 71236
19.7%
F 38228
 
10.6%
m 38228
 
10.6%
M 33005
 
9.1%
n 12
 
< 0.1%
U 3
 
< 0.1%
k 3
 
< 0.1%
o 3
 
< 0.1%
Other values (5) 15
 
< 0.1%
Common
ValueCountFrequency (%)
/ 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 361433
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 109461
30.3%
a 71236
19.7%
l 71236
19.7%
F 38228
 
10.6%
m 38228
 
10.6%
M 33005
 
9.1%
n 12
 
< 0.1%
U 3
 
< 0.1%
k 3
 
< 0.1%
o 3
 
< 0.1%
Other values (6) 18
 
< 0.1%

age
Categorical

MISSING 

Distinct: 10
Distinct (%): < 0.1%
Missing: 3557
Missing (%): 5.0%
Memory size: 556.7 KiB
[70-80) | 17359
[60-70) | 14908
[80-90) | 11510
[50-60) | 11423
[40-50) | 6418
Other values (5) | 6061

Length

Max length: 8
Median length: 7
Mean length: 7.0261529
Min length: 6

Characters and Unicode

Total characters: 475523
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [70-80)
2nd row: [50-60)
3rd row: [60-70)
4th row: [60-70)
5th row: [70-80)

Common Values

Value | Count | Frequency (%)
[70-80) | 17359 | 24.4%
[60-70) | 14908 | 20.9%
[80-90) | 11510 | 16.2%
[50-60) | 11423 | 16.0%
[40-50) | 6418 | 9.0%
[30-40) | 2536 | 3.6%
[90-100) | 1875 | 2.6%
[20-30) | 1071 | 1.5%
[10-20) | 474 | 0.7%
[0-10) | 105 | 0.1%
(Missing) | 3557 | 5.0%
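
The age values are interval labels rather than numbers, so they need parsing before any numeric use. A minimal sketch, assuming every label follows the "[lower-upper)" pattern seen above; parse_bin and bin_midpoint are hypothetical helper names, and using the midpoint is only one possible encoding:

```python
import re

# Matches labels such as "[70-80)": closed lower bound, open upper bound.
_BIN_PATTERN = re.compile(r"^\[(\d+)-(\d+)\)$")


def parse_bin(label):
    """Return (lower, upper) for a label like "[70-80)", or None if it does not match."""
    match = _BIN_PATTERN.match(label)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def bin_midpoint(label):
    """Midpoint of the interval, as a rough numeric stand-in for the bin."""
    bounds = parse_bin(label)
    if bounds is None:
        return None
    lower, upper = bounds
    return (lower + upper) / 2


print(parse_bin("[70-80)"), bin_midpoint("[90-100)"))
```

Labels that do not match the pattern return None, so missing values pass through without raising.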

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
70-80 17359
25.6%
60-70 14908
22.0%
80-90 11510
17.0%
50-60 11423
16.9%
40-50 6418
 
9.5%
30-40 2536
 
3.7%
90-100 1875
 
2.8%
20-30 1071
 
1.6%
10-20 474
 
0.7%
0-10 105
 
0.2%

Most occurring characters

ValueCountFrequency (%)
0 137233
28.9%
[ 67679
14.2%
- 67679
14.2%
) 67679
14.2%
7 32267
 
6.8%
8 28869
 
6.1%
6 26331
 
5.5%
5 17841
 
3.8%
9 13385
 
2.8%
4 8954
 
1.9%
Other values (3) 7606
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 272486
57.3%
Open Punctuation 67679
 
14.2%
Dash Punctuation 67679
 
14.2%
Close Punctuation 67679
 
14.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 137233
50.4%
7 32267
 
11.8%
8 28869
 
10.6%
6 26331
 
9.7%
5 17841
 
6.5%
9 13385
 
4.9%
4 8954
 
3.3%
3 3607
 
1.3%
1 2454
 
0.9%
2 1545
 
0.6%
Open Punctuation
ValueCountFrequency (%)
[ 67679
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 67679
100.0%
Close Punctuation
ValueCountFrequency (%)
) 67679
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 475523
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 137233
28.9%
[ 67679
14.2%
- 67679
14.2%
) 67679
14.2%
7 32267
 
6.8%
8 28869
 
6.1%
6 26331
 
5.5%
5 17841
 
3.8%
9 13385
 
2.8%
4 8954
 
1.9%
Other values (3) 7606
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 475523
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 137233
28.9%
[ 67679
14.2%
- 67679
14.2%
) 67679
14.2%
7 32267
 
6.8%
8 28869
 
6.1%
6 26331
 
5.5%
5 17841
 
3.8%
9 13385
 
2.8%
4 8954
 
1.9%
Other values (3) 7606
 
1.6%

weight
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB
? | 68990
[75-100) | 933
[50-75) | 636
[100-125) | 449
[125-150) | 96
Other values (5) | 132

Length

Max length: 9
Median length: 1
Mean length: 1.2177831
Min length: 1

Characters and Unicode

Total characters: 86750
Distinct characters: 10
Distinct categories: 6
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Common Values

Value | Count | Frequency (%)
? | 68990 | 96.8%
[75-100) | 933 | 1.3%
[50-75) | 636 | 0.9%
[100-125) | 449 | 0.6%
[125-150) | 96 | 0.1%
[25-50) | 67 | 0.1%
[0-25) | 35 | < 0.1%
[150-175) | 21 | < 0.1%
[175-200) | 7 | < 0.1%
>200 | 2 | < 0.1%
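
The 92.0% imbalance figure reported for weight in the alerts is consistent with one minus the normalized Shannon entropy of the category counts. That formula is an assumption about how the score is derived, not something this report states; imbalance_score is a hypothetical helper name.

```python
import math

# Category counts copied from the weight Common Values table above.
counts = [68990, 933, 636, 449, 96, 67, 35, 21, 7, 2]


def imbalance_score(counts):
    """1 - H / log2(k): 0 for perfectly balanced classes, near 1 when one class dominates."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"{100 * score:.1f}%")
```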

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
68990
96.8%
75-100 933
 
1.3%
50-75 636
 
0.9%
100-125 449
 
0.6%
125-150 96
 
0.1%
25-50 67
 
0.1%
0-25 35
 
< 0.1%
150-175 21
 
< 0.1%
175-200 7
 
< 0.1%
200 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
? 68990
79.5%
0 3637
 
4.2%
5 3064
 
3.5%
[ 2244
 
2.6%
- 2244
 
2.6%
) 2244
 
2.6%
1 2072
 
2.4%
7 1597
 
1.8%
2 656
 
0.8%
> 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 68990
79.5%
Decimal Number 11026
 
12.7%
Open Punctuation 2244
 
2.6%
Dash Punctuation 2244
 
2.6%
Close Punctuation 2244
 
2.6%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3637
33.0%
5 3064
27.8%
1 2072
18.8%
7 1597
14.5%
2 656
 
5.9%
Other Punctuation
ValueCountFrequency (%)
? 68990
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 2244
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2244
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2244
100.0%
Math Symbol
ValueCountFrequency (%)
> 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 86750
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
? 68990
79.5%
0 3637
 
4.2%
5 3064
 
3.5%
[ 2244
 
2.6%
- 2244
 
2.6%
) 2244
 
2.6%
1 2072
 
2.4%
7 1597
 
1.8%
2 656
 
0.8%
> 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 86750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
? 68990
79.5%
0 3637
 
4.2%
5 3064
 
3.5%
[ 2244
 
2.6%
- 2244
 
2.6%
) 2244
 
2.6%
1 2072
 
2.4%
7 1597
 
1.8%
2 656
 
0.8%
> 2
 
< 0.1%

payer_code
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB
? | 28201
MC | 22683
HM | 4319
SP | 3541
BC | 3292
Other values (13) | 9200

Length

Max length: 2
Median length: 2
Mean length: 1.6041187
Min length: 1

Characters and Unicode

Total characters: 114271
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: MC
5th row: HM

Common Values

Value | Count | Frequency (%)
? | 28201 | 39.6%
MC | 22683 | 31.8%
HM | 4319 | 6.1%
SP | 3541 | 5.0%
BC | 3292 | 4.6%
MD | 2484 | 3.5%
CP | 1762 | 2.5%
UN | 1733 | 2.4%
CM | 1347 | 1.9%
OG | 729 | 1.0%
Other values (8) | 1145 | 1.6%
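
Several categorical columns (race, weight, payer_code) contain a literal "?" that the profile counts as an ordinary category rather than as missing. The sketch below shows how the missing share changes if "?" is treated as a missing-value placeholder; that interpretation is an assumption, and effective_missing_pct is a hypothetical helper name.

```python
n_observations = 71236

# (explicit missing count, count of "?" values), copied from the sections above.
columns = {
    "race": (3554, 1516),
    "weight": (0, 68990),
    "payer_code": (0, 28201),
}


def effective_missing_pct(explicit_missing, placeholder_count, n_rows):
    """Share of rows with no usable value, counting "?" as missing."""
    return 100 * (explicit_missing + placeholder_count) / n_rows


effective = {
    name: effective_missing_pct(missing, placeholder, n_observations)
    for name, (missing, placeholder) in columns.items()
}

for name, pct in effective.items():
    print(f"{name}: {pct:.1f}%")
```

When loading the raw file with pandas, passing na_values=["?"] to read_csv is one way to have these values parsed as missing from the start.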

Length

Histogram of lengths of the category
ValueCountFrequency (%)
28201
39.6%
mc 22683
31.8%
hm 4319
 
6.1%
sp 3541
 
5.0%
bc 3292
 
4.6%
md 2484
 
3.5%
cp 1762
 
2.5%
un 1733
 
2.4%
cm 1347
 
1.9%
og 729
 
1.0%
Other values (8) 1145
 
1.6%

Most occurring characters

ValueCountFrequency (%)
M 31265
27.4%
C 29285
25.6%
? 28201
24.7%
P 5762
 
5.0%
H 4421
 
3.9%
S 3586
 
3.1%
B 3292
 
2.9%
D 2862
 
2.5%
N 1733
 
1.5%
U 1733
 
1.5%
Other values (7) 2131
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 86070
75.3%
Other Punctuation 28201
 
24.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 31265
36.3%
C 29285
34.0%
P 5762
 
6.7%
H 4421
 
5.1%
S 3586
 
4.2%
B 3292
 
3.8%
D 2862
 
3.3%
N 1733
 
2.0%
U 1733
 
2.0%
O 1195
 
1.4%
Other values (6) 936
 
1.1%
Other Punctuation
ValueCountFrequency (%)
? 28201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 86070
75.3%
Common 28201
 
24.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 31265
36.3%
C 29285
34.0%
P 5762
 
6.7%
H 4421
 
5.1%
S 3586
 
4.2%
B 3292
 
3.8%
D 2862
 
3.3%
N 1733
 
2.0%
U 1733
 
2.0%
O 1195
 
1.4%
Other values (6) 936
 
1.1%
Common
ValueCountFrequency (%)
? 28201
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 114271
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 31265
27.4%
C 29285
25.6%
? 28201
24.7%
P 5762
 
5.0%
H 4421
 
3.9%
S 3586
 
3.1%
B 3292
 
2.9%
D 2862
 
2.5%
N 1733
 
1.5%
U 1733
 
1.5%
Other values (7) 2131
 
1.9%

outpatient_visits_in_previous_year
Real number (ℝ)

ZEROS 

Distinct: 38
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.36958841
Minimum: 0
Maximum: 42
Zeros: 59587
Zeros (%): 83.6%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.2874691
Coefficient of variation (CV): 3.4835213
Kurtosis: 153.15694
Mean: 0.36958841
Median Absolute Deviation (MAD): 0
Skewness: 9.0844775
Sum: 26328
Variance: 1.6575767
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
ValueCountFrequency (%)
0 59587
83.6%
1 5967
 
8.4%
2 2465
 
3.5%
3 1431
 
2.0%
4 760
 
1.1%
5 370
 
0.5%
6 212
 
0.3%
7 107
 
0.2%
8 67
 
0.1%
9 59
 
0.1%
Other values (28) 211
 
0.3%
ValueCountFrequency (%)
0 59587
83.6%
1 5967
 
8.4%
2 2465
 
3.5%
3 1431
 
2.0%
4 760
 
1.1%
5 370
 
0.5%
6 212
 
0.3%
7 107
 
0.2%
8 67
 
0.1%
9 59
 
0.1%
ValueCountFrequency (%)
42 1
< 0.1%
39 1
< 0.1%
38 1
< 0.1%
37 1
< 0.1%
36 1
< 0.1%
35 2
< 0.1%
34 1
< 0.1%
33 1
< 0.1%
29 2
< 0.1%
28 1
< 0.1%

emergency_visits_in_previous_year
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 30
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.19624909
Minimum: 0
Maximum: 76
Zeros: 63242
Zeros (%): 88.8%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 76
Range: 76
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.91085372
Coefficient of variation (CV): 4.6413144
Kurtosis: 1216.0355
Mean: 0.19624909
Median Absolute Deviation (MAD): 0
Skewness: 22.475732
Sum: 13980
Variance: 0.82965449
Monotonicity: Not monotonic
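
The large positive skewness reflects a long right tail: most encounters have zero emergency visits and a few have many. The sketch below computes the plain moment-based skewness (third central moment divided by the 1.5 power of the second) on a toy sample. The data are hypothetical, and the profiling tool may apply a bias correction, so this is not expected to reproduce the reported 22.475732.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no defined skew; report 0
    return m3 / m2 ** 1.5


# Hypothetical zero-heavy sample, not taken from the dataset.
toy_counts = [0, 0, 0, 0, 10]

print(skewness(toy_counts))
```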
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
0 63242
88.8%
1 5422
 
7.6%
2 1432
 
2.0%
3 502
 
0.7%
4 262
 
0.4%
5 126
 
0.2%
6 71
 
0.1%
7 47
 
0.1%
10 24
 
< 0.1%
8 23
 
< 0.1%
Other values (20) 85
 
0.1%
ValueCountFrequency (%)
0 63242
88.8%
1 5422
 
7.6%
2 1432
 
2.0%
3 502
 
0.7%
4 262
 
0.4%
5 126
 
0.2%
6 71
 
0.1%
7 47
 
0.1%
8 23
 
< 0.1%
9 18
 
< 0.1%
ValueCountFrequency (%)
76 1
 
< 0.1%
63 1
 
< 0.1%
42 1
 
< 0.1%
37 1
 
< 0.1%
29 1
 
< 0.1%
28 1
 
< 0.1%
25 1
 
< 0.1%
24 1
 
< 0.1%
22 5
< 0.1%
21 2
 
< 0.1%

inpatient_visits_in_previous_year
Real number (ℝ)

ZEROS 

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.64015385
Minimum: 0
Maximum: 21
Zeros: 47231
Zeros (%): 66.3%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2672711
Coefficient of variation (CV): 1.9796352
Kurtosis: 20.274024
Mean: 0.64015385
Median Absolute Deviation (MAD): 0
Skewness: 3.5752115
Sum: 45602
Variance: 1.605976
Monotonicity: Not monotonic
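
For the three visit-count columns, the mean and the zero share can both be recovered from the reported Sum and Zeros figures and the 71236 observations. A short sketch with values copied from the three sections; the dictionary layout is illustrative:

```python
n_observations = 71236

# (Sum, Zeros) copied from each variable's statistics.
visit_columns = {
    "outpatient_visits_in_previous_year": (26328, 59587),
    "emergency_visits_in_previous_year": (13980, 63242),
    "inpatient_visits_in_previous_year": (45602, 47231),
}

summary = {}
for name, (total, zeros) in visit_columns.items():
    mean = total / n_observations            # average visits per encounter
    zero_pct = 100 * zeros / n_observations  # share of encounters with no visits
    summary[name] = (mean, zero_pct)
    print(f"{name}: mean={mean:.4f}, zeros={zero_pct:.1f}%")
```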
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
0 47231
66.3%
1 13706
 
19.2%
2 5303
 
7.4%
3 2407
 
3.4%
4 1159
 
1.6%
5 553
 
0.8%
6 361
 
0.5%
7 195
 
0.3%
8 109
 
0.2%
9 81
 
0.1%
Other values (11) 131
 
0.2%
ValueCountFrequency (%)
0 47231
66.3%
1 13706
 
19.2%
2 5303
 
7.4%
3 2407
 
3.4%
4 1159
 
1.6%
5 553
 
0.8%
6 361
 
0.5%
7 195
 
0.3%
8 109
 
0.2%
9 81
 
0.1%
ValueCountFrequency (%)
21 1
 
< 0.1%
19 2
 
< 0.1%
18 1
 
< 0.1%
17 1
 
< 0.1%
16 3
 
< 0.1%
15 4
 
< 0.1%
14 4
 
< 0.1%
13 15
< 0.1%
12 25
< 0.1%
11 33
< 0.1%

admission_type
Categorical

MISSING 

Distinct: 7
Distinct (%): < 0.1%
Missing: 3706
Missing (%): 5.2%
Memory size: 556.7 KiB
Emergency | 37742
Elective | 13211
Urgent | 13024
Not Available | 3320
Not Mapped | 214
Other values (2) | 19

Length

Max length: 13
Median length: 9
Mean length: 8.4261958
Min length: 6

Characters and Unicode

Total characters: 569021
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Emergency
2nd row: Emergency
3rd row: Emergency
4th row: Elective
5th row: Emergency

Common Values

Value | Count | Frequency (%)
Emergency | 37742 | 53.0%
Elective | 13211 | 18.5%
Urgent | 13024 | 18.3%
Not Available | 3320 | 4.7%
Not Mapped | 214 | 0.3%
Trauma Center | 13 | < 0.1%
Newborn | 6 | < 0.1%
(Missing) | 3706 | 5.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
emergency 37742
53.1%
elective 13211
 
18.6%
urgent 13024
 
18.3%
not 3534
 
5.0%
available 3320
 
4.7%
mapped 214
 
0.3%
trauma 13
 
< 0.1%
center 13
 
< 0.1%
newborn 6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 118496
20.8%
E 50953
9.0%
c 50953
9.0%
r 50798
8.9%
n 50785
8.9%
g 50766
8.9%
m 37755
 
6.6%
y 37742
 
6.6%
t 29782
 
5.2%
l 19851
 
3.5%
Other values (16) 71140
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 494397
86.9%
Uppercase Letter 71077
 
12.5%
Space Separator 3547
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 118496
24.0%
c 50953
10.3%
r 50798
10.3%
n 50785
10.3%
g 50766
10.3%
m 37755
 
7.6%
y 37742
 
7.6%
t 29782
 
6.0%
l 19851
 
4.0%
i 16531
 
3.3%
Other values (8) 30938
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
E 50953
71.7%
U 13024
 
18.3%
N 3540
 
5.0%
A 3320
 
4.7%
M 214
 
0.3%
T 13
 
< 0.1%
C 13
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3547
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 565474
99.4%
Common 3547
 
0.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 118496
21.0%
E 50953
9.0%
c 50953
9.0%
r 50798
9.0%
n 50785
9.0%
g 50766
9.0%
m 37755
 
6.7%
y 37742
 
6.7%
t 29782
 
5.3%
l 19851
 
3.5%
Other values (15) 67593
12.0%
Common
ValueCountFrequency (%)
3547
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 569021
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 118496
20.8%
E 50953
9.0%
c 50953
9.0%
r 50798
8.9%
n 50785
8.9%
g 50766
8.9%
m 37755
 
6.6%
y 37742
 
6.6%
t 29782
 
5.2%
l 19851
 
3.5%
Other values (16) 71140
12.5%

Distinct: 69
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Length

Max length: 36
Median length: 33
Mean length: 8.6286709
Min length: 1

Characters and Unicode

Total characters: 614672
Distinct characters: 42
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Family/GeneralPractice
2nd row: ?
3rd row: Family/GeneralPractice
4th row: InternalMedicine
5th row: ?
ValueCountFrequency (%)
34922
49.0%
internalmedicine 10292
 
14.4%
emergency/trauma 5319
 
7.5%
family/generalpractice 5217
 
7.3%
cardiology 3716
 
5.2%
surgery-general 2144
 
3.0%
nephrology 1136
 
1.6%
orthopedics 954
 
1.3%
orthopedics-reconstructive 867
 
1.2%
radiologist 817
 
1.1%
Other values (59) 5852
 
8.2%

Most occurring characters

ValueCountFrequency (%)
e 73743
 
12.0%
r 53825
 
8.8%
a 49937
 
8.1%
n 48349
 
7.9%
i 44479
 
7.2%
c 35137
 
5.7%
? 34922
 
5.7%
l 34274
 
5.6%
y 24470
 
4.0%
t 23985
 
3.9%
Other values (32) 191551
31.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 495142
80.6%
Uppercase Letter 68812
 
11.2%
Other Punctuation 46106
 
7.5%
Dash Punctuation 4612
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 73743
14.9%
r 53825
10.9%
a 49937
10.1%
n 48349
9.8%
i 44479
9.0%
c 35137
7.1%
l 34274
6.9%
y 24470
 
4.9%
t 23985
 
4.8%
o 23879
 
4.8%
Other values (11) 83064
16.8%
Uppercase Letter
ValueCountFrequency (%)
M 10589
15.4%
I 10324
15.0%
G 8312
12.1%
P 7331
10.7%
T 5868
8.5%
E 5529
8.0%
F 5223
7.6%
C 4381
6.4%
S 3577
 
5.2%
O 2887
 
4.2%
Other values (7) 4791
7.0%
Other Punctuation
ValueCountFrequency (%)
? 34922
75.7%
/ 11159
 
24.2%
& 25
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4612
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 563954
91.7%
Common 50718
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 73743
13.1%
r 53825
 
9.5%
a 49937
 
8.9%
n 48349
 
8.6%
i 44479
 
7.9%
c 35137
 
6.2%
l 34274
 
6.1%
y 24470
 
4.3%
t 23985
 
4.3%
o 23879
 
4.2%
Other values (28) 151876
26.9%
Common
ValueCountFrequency (%)
? 34922
68.9%
/ 11159
 
22.0%
- 4612
 
9.1%
& 25
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 614672
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 73743
 
12.0%
r 53825
 
8.8%
a 49937
 
8.1%
n 48349
 
7.9%
i 44479
 
7.2%
c 35137
 
5.7%
? 34922
 
5.7%
l 34274
 
5.6%
y 24470
 
4.0%
t 23985
 
3.9%
Other values (32) 191551
31.2%

average_pulse_bpm
Real number (ℝ)

Distinct: 80
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.486622
Minimum: 60
Maximum: 139
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 60
5-th percentile: 63
Q1: 79
median: 100
Q3: 119
95-th percentile: 136
Maximum: 139
Range: 79
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 23.098802
Coefficient of variation (CV): 0.23217998
Kurtosis: -1.1982491
Mean: 99.486622
Median Absolute Deviation (MAD): 20
Skewness: -0.0014253215
Sum: 7087029
Variance: 533.55463
Monotonicity: Not monotonic
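
These statistics sit close to what a discrete uniform distribution over the integers 60 to 139 would give (near-zero skewness, kurtosis near -1.2, about 1.2% to 1.3% of rows per value). The sketch below computes the theoretical values from the textbook discrete-uniform formulas for comparison. Treating the column as uniform is an observation about the numbers, not a claim made by the report.

```python
# Bounds copied from the Minimum and Maximum above.
lo, hi = 60, 139

n_values = hi - lo + 1                      # number of distinct integer values
uniform_mean = (lo + hi) / 2                # mean of a discrete uniform on lo..hi
uniform_variance = (n_values**2 - 1) / 12   # variance of a discrete uniform
uniform_excess_kurtosis = -6 * (n_values**2 + 1) / (5 * (n_values**2 - 1))

print(n_values, uniform_mean, uniform_variance, round(uniform_excess_kurtosis, 4))
```

The theoretical mean of 99.5 and variance of 533.25 compare with the reported 99.486622 and 533.55463.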
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
65 952
 
1.3%
105 951
 
1.3%
90 942
 
1.3%
61 937
 
1.3%
68 933
 
1.3%
131 925
 
1.3%
103 923
 
1.3%
126 923
 
1.3%
73 920
 
1.3%
80 917
 
1.3%
Other values (70) 61913
86.9%
ValueCountFrequency (%)
60 895
1.3%
61 937
1.3%
62 906
1.3%
63 829
1.2%
64 872
1.2%
65 952
1.3%
66 895
1.3%
67 906
1.3%
68 933
1.3%
69 852
1.2%
ValueCountFrequency (%)
139 855
1.2%
138 889
1.2%
137 908
1.3%
136 914
1.3%
135 900
1.3%
134 894
1.3%
133 869
1.2%
132 882
1.2%
131 925
1.3%
130 917
1.3%

discharge_disposition
Categorical

IMBALANCE  MISSING 

Distinct: 25
Distinct (%): < 0.1%
Missing: 2590
Missing (%): 3.6%
Memory size: 556.7 KiB
Discharged to home | 42256
Discharged/transferred to SNF | 9780
Discharged/transferred to home with home health service | 9005
Discharged/transferred to another short term hospital | 1488
Discharged/transferred to another rehab fac including rehab units of a hospital . | 1393
Other values (20) | 4724

Length

Max length: 105
Median length: 18
Mean length: 27.255689
Min length: 7

Characters and Unicode

Total characters: 1870994
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Discharged to home
2nd row: Discharged/transferred to a federal health care facility.
3rd row: Discharged to home
4th row: Discharged to home
5th row: Discharged/transferred to home with home health service

Common Values

Value | Count | Frequency (%)
Discharged to home | 42256 | 59.3%
Discharged/transferred to SNF | 9780 | 13.7%
Discharged/transferred to home with home health service | 9005 | 12.6%
Discharged/transferred to another short term hospital | 1488 | 2.1%
Discharged/transferred to another rehab fac including rehab units of a hospital . | 1393 | 2.0%
Expired | 1135 | 1.6%
Discharged/transferred to another type of inpatient care institution | 822 | 1.2%
Not Mapped | 679 | 1.0%
Discharged/transferred to ICF | 571 | 0.8%
Left AMA | 421 | 0.6%
Other values (15) | 1096 | 1.5%
(Missing) | 2590 | 3.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
to 65878
25.0%
home 60692
23.1%
discharged 42258
16.0%
discharged/transferred 23499
 
8.9%
snf 9780
 
3.7%
health 9008
 
3.4%
with 9005
 
3.4%
service 9005
 
3.4%
another 3712
 
1.4%
hospital 3372
 
1.3%
Other values (59) 27083
10.3%

Most occurring characters

ValueCountFrequency (%)
e 216005
11.5%
194646
10.4%
h 165172
 
8.8%
r 158998
 
8.5%
o 140206
 
7.5%
t 126692
 
6.8%
a 115434
 
6.2%
s 106629
 
5.7%
i 99598
 
5.3%
d 93734
 
5.0%
Other values (28) 453880
24.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1548376
82.8%
Space Separator 194646
 
10.4%
Uppercase Letter 101999
 
5.5%
Other Punctuation 25973
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 216005
14.0%
h 165172
10.7%
r 158998
10.3%
o 140206
9.1%
t 126692
8.2%
a 115434
7.5%
s 106629
6.9%
i 99598
6.4%
d 93734
 
6.1%
c 80622
 
5.2%
Other values (12) 245286
15.8%
Uppercase Letter
ValueCountFrequency (%)
D 65868
64.6%
N 10461
 
10.3%
F 10351
 
10.1%
S 9782
 
9.6%
M 1215
 
1.2%
E 1142
 
1.1%
A 855
 
0.8%
I 652
 
0.6%
H 600
 
0.6%
C 571
 
0.6%
Other values (2) 502
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 24244
93.3%
. 1722
 
6.6%
, 7
 
< 0.1%
Space Separator
ValueCountFrequency (%)
194646
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1650375
88.2%
Common 220619
 
11.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 216005
13.1%
h 165172
10.0%
r 158998
9.6%
o 140206
8.5%
t 126692
 
7.7%
a 115434
 
7.0%
s 106629
 
6.5%
i 99598
 
6.0%
d 93734
 
5.7%
c 80622
 
4.9%
Other values (24) 347285
21.0%
Common
ValueCountFrequency (%)
194646
88.2%
/ 24244
 
11.0%
. 1722
 
0.8%
, 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1870994
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 216005
11.5%
194646
10.4%
h 165172
 
8.8%
r 158998
 
8.5%
o 140206
 
7.5%
t 126692
 
6.8%
a 115434
 
6.2%
s 106629
 
5.7%
i 99598
 
5.3%
d 93734
 
5.0%
Other values (28) 453880
24.3%

admission_source
Categorical

IMBALANCE  MISSING 

Distinct: 16
Distinct (%): < 0.1%
Missing: 4718
Missing (%): 6.6%
Memory size: 556.7 KiB

Value | Count
Emergency Room | 40319
Physician Referral | 20678
Transfer from a hospital | 2230
Transfer from another health care facility | 1562
Clinic Referral | 779
Other values (11) | 950

Length

Max length: 58
Median length: 15
Mean length: 17.484801
Min length: 10

Characters and Unicode

Total characters: 1163054
Distinct characters: 42
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row Emergency Room
2nd rowClinic Referral
3rd row Transfer from another health care facility
4th row Physician Referral
5th row Emergency Room
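
Several sample rows above begin with a space, and the reported median length of 15 is consistent with " Emergency Room" carrying one leading space (the label itself has 14 characters). If that whitespace is unintended, the labels can be normalized before analysis. A minimal sketch, assuming the padding carries no meaning; normalize_label is a hypothetical helper:

```python
def normalize_label(value):
    """Trim outer whitespace and collapse internal runs; keep missing values as None."""
    if value is None:
        return None
    return " ".join(value.split())


# Raw labels as they appear in the sample rows, plus one missing value.
raw = [" Emergency Room", "Clinic Referral", " Physician Referral", None]
clean = [normalize_label(v) for v in raw]

print(clean)
```

Normalizing first avoids treating padded and unpadded spellings of the same label as separate categories.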

Common Values

Value | Count | Frequency (%)
Emergency Room | 40319 | 56.6%
Physician Referral | 20678 | 29.0%
Transfer from a hospital | 2230 | 3.1%
Transfer from another health care facility | 1562 | 2.2%
Clinic Referral | 779 | 1.1%
Transfer from a Skilled Nursing Facility (SNF) | 595 | 0.8%
HMO Referral | 129 | 0.2%
Not Mapped | 107 | 0.2%
Not Available | 88 | 0.1%
Court/Law Enforcement | 11 | < 0.1%
Other values (6) | 20 | < 0.1%
(Missing) | 4718 | 6.6%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
emergency | 40319 | 27.5%
room | 40319 | 27.5%
referral | 21586 | 14.7%
physician | 20678 | 14.1%
transfer | 4404 | 3.0%
from | 4404 | 3.0%
a | 2833 | 1.9%
hospital | 2245 | 1.5%
facility | 2157 | 1.5%
care | 1562 | 1.1%
Other values (29) | 6303 | 4.3%

Most occurring characters

ValueCountFrequency (%)
143671
12.4%
e 133751
11.5%
r 100472
 
8.6%
o 89069
 
7.7%
m 85073
 
7.3%
n 68377
 
5.9%
c 65544
 
5.6%
y 63160
 
5.4%
R 61905
 
5.3%
a 58927
 
5.1%
Other values (32) 293105
25.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 885697
76.2%
Space Separator 143671
 
12.4%
Uppercase Letter 132477
 
11.4%
Open Punctuation 595
 
0.1%
Close Punctuation 595
 
0.1%
Other Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 133751
15.1%
r 100472
11.3%
o 89069
10.1%
m 85073
9.6%
n 68377
7.7%
c 65544
7.4%
y 63160
7.1%
a 58927
6.7%
i 50792
 
5.7%
g 40916
 
4.6%
Other values (13) 129616
14.6%
Uppercase Letter
ValueCountFrequency (%)
R 61905
46.7%
E 40331
30.4%
P 20678
 
15.6%
T 4404
 
3.3%
N 1386
 
1.0%
S 1193
 
0.9%
F 1190
 
0.9%
C 792
 
0.6%
M 236
 
0.2%
H 129
 
0.1%
Other values (5) 233
 
0.2%
Space Separator
ValueCountFrequency (%)
143671
100.0%
Open Punctuation
ValueCountFrequency (%)
( 595
100.0%
Close Punctuation
ValueCountFrequency (%)
) 595
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1018174
87.5%
Common 144880
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 133751
13.1%
r 100472
9.9%
o 89069
 
8.7%
m 85073
 
8.4%
n 68377
 
6.7%
c 65544
 
6.4%
y 63160
 
6.2%
R 61905
 
6.1%
a 58927
 
5.8%
i 50792
 
5.0%
Other values (28) 241104
23.7%
Common
ValueCountFrequency (%)
143671
99.2%
( 595
 
0.4%
) 595
 
0.4%
/ 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1163054
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
143671
12.4%
e 133751
11.5%
r 100472
 
8.6%
o 89069
 
7.7%
m 85073
 
7.3%
n 68377
 
5.9%
c 65544
 
5.6%
y 63160
 
5.4%
R 61905
 
5.3%
a 58927
 
5.1%
Other values (32) 293105
25.2%
Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3910242
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 11
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.9887391
Coefficient of variation (CV): 0.68064738
Kurtosis: 0.84694054
Mean: 4.3910242
Median Absolute Deviation (MAD): 2
Skewness: 1.1354771
Sum: 312799
Variance: 8.9325615
Monotonicity: Not monotonic
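
Several of the figures above are related by simple identities, which gives a quick consistency check on the report. The sketch below recomputes them from the reported values; it assumes the observation count of 71236 given in the report overview, and the variable names are illustrative.

```python
# Values copied from the statistics above.
n_rows = 71236          # observation count from the report overview
total = 312799          # Sum
mean = 4.3910242        # Mean
std = 2.9887391         # Standard deviation
q1, q3 = 2, 6           # Q1 and Q3
minimum, maximum = 1, 14

mean_check = total / n_rows      # mean = sum / number of observations
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance = squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(round(mean_check, 7), round(cv, 8), round(variance, 6), iqr, value_range)
```

Each recomputed quantity agrees with the reported one up to the rounding of the inputs.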
[Figure: Histogram with fixed size bins (bins=14)]

Value | Count | Frequency (%)
3 | 12434 | 17.5%
2 | 12119 | 17.0%
1 | 10010 | 14.1%
4 | 9656 | 13.6%
5 | 6967 | 9.8%
6 | 5237 | 7.4%
7 | 4154 | 5.8%
8 | 3003 | 4.2%
9 | 2105 | 3.0%
10 | 1637 | 2.3%
Other values (4) | 3914 | 5.5%

Value | Count | Frequency (%)
1 | 10010 | 14.1%
2 | 12119 | 17.0%
3 | 12434 | 17.5%
4 | 9656 | 13.6%
5 | 6967 | 9.8%
6 | 5237 | 7.4%
7 | 4154 | 5.8%
8 | 3003 | 4.2%
9 | 2105 | 3.0%
10 | 1637 | 2.3%

Value | Count | Frequency (%)
14 | 723 | 1.0%
13 | 859 | 1.2%
12 | 1010 | 1.4%
11 | 1322 | 1.9%
10 | 1637 | 2.3%
9 | 2105 | 3.0%
8 | 3003 | 4.2%
7 | 4154 | 5.8%
6 | 5237 | 7.4%
5 | 6967 | 9.8%

number_lab_tests
Real number (ℝ)

Distinct: 114
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.095654
Minimum: 1
Maximum: 121
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 31
median: 44
Q3: 57
95-th percentile: 73
Maximum: 121
Range: 120
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.642919
Coefficient of variation (CV): 0.45579814
Kurtosis: -0.25555922
Mean: 43.095654
Median Absolute Deviation (MAD): 13
Skewness: -0.2366487
Sum: 3069962
Variance: 385.84426
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1 | 2183 | 3.1%
43 | 1965 | 2.8%
44 | 1787 | 2.5%
45 | 1672 | 2.3%
40 | 1553 | 2.2%
46 | 1535 | 2.2%
38 | 1524 | 2.1%
41 | 1505 | 2.1%
47 | 1504 | 2.1%
39 | 1483 | 2.1%
Other values (104) | 54525 | 76.5%

Value | Count | Frequency (%)
1 | 2183 | 3.1%
2 | 788 | 1.1%
3 | 466 | 0.7%
4 | 268 | 0.4%
5 | 203 | 0.3%
6 | 212 | 0.3%
7 | 237 | 0.3%
8 | 251 | 0.4%
9 | 653 | 0.9%
10 | 585 | 0.8%

Value | Count | Frequency (%)
121 | 1 | < 0.1%
118 | 1 | < 0.1%
114 | 1 | < 0.1%
113 | 2 | < 0.1%
111 | 3 | < 0.1%
109 | 2 | < 0.1%
108 | 2 | < 0.1%
107 | 1 | < 0.1%
106 | 4 | < 0.1%
105 | 2 | < 0.1%

non_lab_procedures
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3409231
Minimum: 0
Maximum: 6
Zeros: 32632
Zeros (%): 45.8%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.7066641
Coefficient of variation (CV): 1.2727531
Kurtosis: 0.85393668
Mean: 1.3409231
Median Absolute Deviation (MAD): 1
Skewness: 1.315643
Sum: 95522
Variance: 2.9127022
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Value | Count | Frequency (%)
0 | 32632 | 45.8%
1 | 14533 | 20.4%
2 | 8896 | 12.5%
3 | 6614 | 9.3%
6 | 3478 | 4.9%
4 | 2928 | 4.1%
5 | 2155 | 3.0%

Value | Count | Frequency (%)
0 | 32632 | 45.8%
1 | 14533 | 20.4%
2 | 8896 | 12.5%
3 | 6614 | 9.3%
4 | 2928 | 4.1%
5 | 2155 | 3.0%
6 | 3478 | 4.9%

Value | Count | Frequency (%)
6 | 3478 | 4.9%
5 | 2155 | 3.0%
4 | 2928 | 4.1%
3 | 6614 | 9.3%
2 | 8896 | 12.5%
1 | 14533 | 20.4%
0 | 32632 | 45.8%

number_of_medications
Real number (ℝ)

Distinct: 72
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.995452
Minimum: 1
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 10
median: 15
Q3: 20
95-th percentile: 31
Maximum: 75
Range: 74
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.122347
Coefficient of variation (CV): 0.50779104
Kurtosis: 3.4625852
Mean: 15.995452
Median Absolute Deviation (MAD): 5
Skewness: 1.330452
Sum: 1139452
Variance: 65.972521
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
13 | 4315 | 6.1%
12 | 4171 | 5.9%
15 | 4065 | 5.7%
11 | 4051 | 5.7%
14 | 4029 | 5.7%
16 | 3786 | 5.3%
10 | 3686 | 5.2%
17 | 3455 | 4.9%
9 | 3403 | 4.8%
18 | 3172 | 4.5%
Other values (62) | 33103 | 46.5%

Value | Count | Frequency (%)
1 | 183 | 0.3%
2 | 323 | 0.5%
3 | 639 | 0.9%
4 | 982 | 1.4%
5 | 1414 | 2.0%
6 | 1955 | 2.7%
7 | 2471 | 3.5%
8 | 3094 | 4.3%
9 | 3403 | 4.8%
10 | 3686 | 5.2%

Value | Count | Frequency (%)
75 | 2 | < 0.1%
74 | 1 | < 0.1%
70 | 2 | < 0.1%
69 | 4 | < 0.1%
68 | 3 | < 0.1%
67 | 5 | < 0.1%
66 | 5 | < 0.1%
65 | 7 | < 0.1%
64 | 6 | < 0.1%
63 | 10 | < 0.1%
Distinct: 687
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.1732691
Min length: 1

Characters and Unicode

Total characters: 226051
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 101
Unique (%): 0.1%

Sample

1st row: 515
2nd row: 38
3rd row: 534
4th row: 569
5th row: 715

Value | Count | Frequency (%)
428 | 4776 | 6.7%
414 | 4596 | 6.5%
786 | 2846 | 4.0%
410 | 2513 | 3.5%
486 | 2488 | 3.5%
427 | 1937 | 2.7%
491 | 1594 | 2.2%
715 | 1512 | 2.1%
682 | 1424 | 2.0%
780 | 1406 | 2.0%
Other values (677) | 46144 | 64.8%

Most occurring characters

ValueCountFrequency (%)
4 38806
17.2%
2 27849
12.3%
8 26665
11.8%
5 26026
11.5%
7 20000
8.8%
1 19675
8.7%
0 17412
7.7%
6 16301
7.2%
9 13925
 
6.2%
3 12286
 
5.4%
Other values (3) 7106
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 218945
96.9%
Other Punctuation 5951
 
2.6%
Uppercase Letter 1155
 
0.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 38806
17.7%
2 27849
12.7%
8 26665
12.2%
5 26026
11.9%
7 20000
9.1%
1 19675
9.0%
0 17412
8.0%
6 16301
7.4%
9 13925
 
6.4%
3 12286
 
5.6%
Other Punctuation
ValueCountFrequency (%)
. 5935
99.7%
? 16
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
V 1155
100.0%
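
The character statistics above show that the codes are mostly digits, with a decimal point in some values, a V prefix in others, and a small number of ? characters. A minimal sketch of how such codes might be bucketed before modeling follows. It assumes that ? marks an unrecorded value, which the report does not state; classify_code is a hypothetical helper, and the sample codes V57 and 250.6 are made up for illustration.

```python
from collections import Counter


def classify_code(code):
    """Bucket a raw code string by its leading character."""
    code = code.strip()
    if code in ("", "?"):
        return "unknown"      # assumption: '?' stands for an unrecorded value
    if code[0].isalpha():
        return f"{code[0].upper()}-prefixed"
    return "numeric"


# '515', '38' and '428' appear in the report; 'V57' and '250.6' are illustrative.
sample_codes = ["515", "38", "428", "V57", "250.6", "?"]
summary = Counter(classify_code(c) for c in sample_codes)

print(dict(summary))
```

Keeping the unknown bucket explicit makes it possible to count or impute those rows later instead of parsing ? as a code.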

Most occurring scripts

ValueCountFrequency (%)
Common 224896
99.5%
Latin 1155
 
0.5%

Most frequent character per script

Common
ValueCountFrequency (%)
4 38806
17.3%
2 27849
12.4%
8 26665
11.9%
5 26026
11.6%
7 20000
8.9%
1 19675
8.7%
0 17412
7.7%
6 16301
7.2%
9 13925
 
6.2%
3 12286
 
5.5%
Other values (2) 5951
 
2.6%
Latin
ValueCountFrequency (%)
V 1155
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 226051
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 38806
17.2%
2 27849
12.3%
8 26665
11.8%
5 26026
11.5%
7 20000
8.8%
1 19675
8.7%
0 17412
7.7%
6 16301
7.2%
9 13925
 
6.2%
3 12286
 
5.4%
Other values (3) 7106
 
3.1%
Distinct: 699
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.1651272
Min length: 1

Characters and Unicode

Total characters: 225471
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 120
Unique (%): 0.2%

Sample

1st row: 276
2nd row: 785
3rd row: 135
4th row: 562
5th row: 599

Value | Count | Frequency (%)
276 | 4694 | 6.6%
428 | 4685 | 6.6%
250 | 4257 | 6.0%
427 | 3537 | 5.0%
401 | 2624 | 3.7%
496 | 2292 | 3.2%
599 | 2291 | 3.2%
403 | 1983 | 2.8%
414 | 1851 | 2.6%
411 | 1777 | 2.5%
Other values (689) | 41245 | 57.9%

Most occurring characters

ValueCountFrequency (%)
4 35807
15.9%
2 34803
15.4%
5 26746
11.9%
0 23894
10.6%
8 20127
8.9%
7 19935
8.8%
1 18270
8.1%
9 15286
6.8%
6 13954
 
6.2%
3 9926
 
4.4%
Other values (4) 6723
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 218748
97.0%
Other Punctuation 4943
 
2.2%
Uppercase Letter 1780
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 35807
16.4%
2 34803
15.9%
5 26746
12.2%
0 23894
10.9%
8 20127
9.2%
7 19935
9.1%
1 18270
8.4%
9 15286
7.0%
6 13954
 
6.4%
3 9926
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 4681
94.7%
? 262
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
V 1265
71.1%
E 515
28.9%

Most occurring scripts

ValueCountFrequency (%)
Common 223691
99.2%
Latin 1780
 
0.8%

Most frequent character per script

Common
ValueCountFrequency (%)
4 35807
16.0%
2 34803
15.6%
5 26746
12.0%
0 23894
10.7%
8 20127
9.0%
7 19935
8.9%
1 18270
8.2%
9 15286
6.8%
6 13954
 
6.2%
3 9926
 
4.4%
Other values (2) 4943
 
2.2%
Latin
ValueCountFrequency (%)
V 1265
71.1%
E 515
28.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 225471
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 35807
15.9%
2 34803
15.4%
5 26746
11.9%
0 23894
10.6%
8 20127
8.9%
7 19935
8.8%
1 18270
8.1%
9 15286
6.8%
6 13954
 
6.2%
3 9926
 
4.4%
Other values (4) 6723
 
3.0%
Distinct: 747
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.1083441
Min length: 1

Characters and Unicode

Total characters: 221426
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): 0.2%

Sample

1st row: 466
2nd row: 162
3rd row: 250
4th row: 455
5th row: 428

Value | Count | Frequency (%)
250 | 8070 | 11.3%
401 | 5784 | 8.1%
276 | 3599 | 5.1%
428 | 3240 | 4.5%
427 | 2767 | 3.9%
414 | 2544 | 3.6%
496 | 1818 | 2.6%
403 | 1671 | 2.3%
599 | 1385 | 1.9%
585 | 1383 | 1.9%
Other values (737) | 38975 | 54.7%

Most occurring characters

ValueCountFrequency (%)
2 35849
16.2%
4 34550
15.6%
5 28879
13.0%
0 27661
12.5%
7 18441
8.3%
1 17159
7.7%
8 16685
7.5%
9 12202
 
5.5%
6 11507
 
5.2%
3 10086
 
4.6%
Other values (4) 8407
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 213019
96.2%
Other Punctuation 4865
 
2.2%
Uppercase Letter 3542
 
1.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 35849
16.8%
4 34550
16.2%
5 28879
13.6%
0 27661
13.0%
7 18441
8.7%
1 17159
8.1%
8 16685
7.8%
9 12202
 
5.7%
6 11507
 
5.4%
3 10086
 
4.7%
Other Punctuation
ValueCountFrequency (%)
. 3857
79.3%
? 1008
 
20.7%
Uppercase Letter
ValueCountFrequency (%)
V 2677
75.6%
E 865
 
24.4%

Most occurring scripts

ValueCountFrequency (%)
Common 217884
98.4%
Latin 3542
 
1.6%

Most frequent character per script

Common
ValueCountFrequency (%)
2 35849
16.5%
4 34550
15.9%
5 28879
13.3%
0 27661
12.7%
7 18441
8.5%
1 17159
7.9%
8 16685
7.7%
9 12202
 
5.6%
6 11507
 
5.3%
3 10086
 
4.6%
Other values (2) 4865
 
2.2%
Latin
ValueCountFrequency (%)
V 2677
75.6%
E 865
 
24.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 221426
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 35849
16.2%
4 34550
15.6%
5 28879
13.0%
0 27661
12.5%
7 18441
8.3%
1 17159
7.7%
8 16685
7.5%
9 12202
 
5.5%
6 11507
 
5.2%
3 10086
 
4.6%
Other values (4) 8407
 
3.8%

number_diagnoses
Real number (ℝ)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.4210231
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 556.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9378085
Coefficient of variation (CV): 0.26112417
Kurtosis: -0.068695161
Mean: 7.4210231
Median Absolute Deviation (MAD): 1
Skewness: -0.87821826
Sum: 528644
Variance: 3.7551018
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=16)]

Value | Count | Frequency (%)
9 | 34669 | 48.7%
5 | 7921 | 11.1%
8 | 7375 | 10.4%
7 | 7264 | 10.2%
6 | 7134 | 10.0%
4 | 3903 | 5.5%
3 | 1994 | 2.8%
2 | 727 | 1.0%
1 | 164 | 0.2%
16 | 33 | < 0.1%
Other values (6) | 52 | 0.1%

Value | Count | Frequency (%)
1 | 164 | 0.2%
2 | 727 | 1.0%
3 | 1994 | 2.8%
4 | 3903 | 5.5%
5 | 7921 | 11.1%
6 | 7134 | 10.0%
7 | 7264 | 10.2%
8 | 7375 | 10.4%
9 | 34669 | 48.7%
10 | 15 | < 0.1%

Value | Count | Frequency (%)
16 | 33 | < 0.1%
15 | 6 | < 0.1%
14 | 4 | < 0.1%
13 | 13 | < 0.1%
12 | 7 | < 0.1%
11 | 7 | < 0.1%
10 | 15 | < 0.1%
9 | 34669 | 48.7%
8 | 7375 | 10.4%
7 | 7264 | 10.2%

glucose_test_result
Categorical

HIGH CORRELATION  MISSING 

Distinct: 3
Distinct (%): 0.1%
Missing: 67548
Missing (%): 94.8%
Memory size: 556.7 KiB

Value | Count
Norm | 1806
>200 | 1055
>300 | 827

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 14752
Distinct characters: 8
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >300
2nd row: Norm
3rd row: >200
4th row: Norm
5th row: Norm

Common Values

Value | Count | Frequency (%)
Norm | 1806 | 2.5%
>200 | 1055 | 1.5%
>300 | 827 | 1.2%
(Missing) | 67548 | 94.8%
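
The counts above can be cross-checked against the missing total, and the missing values can be given an explicit label so that downstream steps do not silently drop 94.8% of the rows. The sketch assumes the 71236 observations reported for the dataset, and the label "Not measured" is a hypothetical choice; the report does not say why the values are missing.

```python
N_ROWS = 71236  # total observations reported for the dataset

# Observed category counts copied from the table above.
observed = {"Norm": 1806, ">200": 1055, ">300": 827}

n_missing = N_ROWS - sum(observed.values())
pct_missing = round(100 * n_missing / N_ROWS, 1)


def encode_result(value):
    """Map a missing result to an explicit category (the label is an assumption)."""
    return "Not measured" if value is None else value


encoded = [encode_result(v) for v in [">300", None, "Norm", None]]

print(n_missing, pct_missing, encoded)
```

Whether an explicit category or dropping the column is the better choice depends on the downstream model; this only shows the mechanics.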

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
norm | 1806 | 49.0%
200 | 1055 | 28.6%
300 | 827 | 22.4%

Most occurring characters

ValueCountFrequency (%)
0 3764
25.5%
> 1882
12.8%
N 1806
12.2%
o 1806
12.2%
r 1806
12.2%
m 1806
12.2%
2 1055
 
7.2%
3 827
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5646
38.3%
Lowercase Letter 5418
36.7%
Math Symbol 1882
 
12.8%
Uppercase Letter 1806
 
12.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3764
66.7%
2 1055
 
18.7%
3 827
 
14.6%
Lowercase Letter
ValueCountFrequency (%)
o 1806
33.3%
r 1806
33.3%
m 1806
33.3%
Math Symbol
ValueCountFrequency (%)
> 1882
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 1806
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7528
51.0%
Latin 7224
49.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3764
50.0%
> 1882
25.0%
2 1055
 
14.0%
3 827
 
11.0%
Latin
ValueCountFrequency (%)
N 1806
25.0%
o 1806
25.0%
r 1806
25.0%
m 1806
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14752
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3764
25.5%
> 1882
12.8%
N 1806
12.2%
o 1806
12.2%
r 1806
12.2%
m 1806
12.2%
2 1055
 
7.2%
3 827
 
5.6%

a1c_test_result
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 59320
Missing (%): 83.3%
Memory size: 556.7 KiB

Value | Count
>8 | 5705
Norm | 3503
>7 | 2708

Length

Max length: 4
Median length: 2
Mean length: 2.587949
Min length: 2

Characters and Unicode

Total characters: 30838
Distinct characters: 7
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Norm
2nd row: >8
3rd row: >7
4th row: Norm
5th row: Norm

Common Values

Value | Count | Frequency (%)
>8 | 5705 | 8.0%
Norm | 3503 | 4.9%
>7 | 2708 | 3.8%
(Missing) | 59320 | 83.3%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
8 | 5705 | 47.9%
norm | 3503 | 29.4%
7 | 2708 | 22.7%

Most occurring characters

ValueCountFrequency (%)
> 8413
27.3%
8 5705
18.5%
N 3503
11.4%
o 3503
11.4%
r 3503
11.4%
m 3503
11.4%
7 2708
 
8.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10509
34.1%
Math Symbol 8413
27.3%
Decimal Number 8413
27.3%
Uppercase Letter 3503
 
11.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3503
33.3%
r 3503
33.3%
m 3503
33.3%
Decimal Number
ValueCountFrequency (%)
8 5705
67.8%
7 2708
32.2%
Math Symbol
ValueCountFrequency (%)
> 8413
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 3503
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16826
54.6%
Latin 14012
45.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 3503
25.0%
o 3503
25.0%
r 3503
25.0%
m 3503
25.0%
Common
ValueCountFrequency (%)
> 8413
50.0%
8 5705
33.9%
7 2708
 
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30838
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
> 8413
27.3%
8 5705
18.5%
N 3503
11.4%
o 3503
11.4%
r 3503
11.4%
m 3503
11.4%
7 2708
 
8.8%

change_in_meds_during_hospitalization
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Value | Count
No | 38326
Ch | 32910

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 142472
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: Ch
4th row: No
5th row: No

Common Values

Value | Count | Frequency (%)
No | 38326 | 53.8%
Ch | 32910 | 46.2%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
no | 38326 | 53.8%
ch | 32910 | 46.2%

Most occurring characters

ValueCountFrequency (%)
N 38326
26.9%
o 38326
26.9%
C 32910
23.1%
h 32910
23.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 71236
50.0%
Lowercase Letter 71236
50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 38326
53.8%
C 32910
46.2%
Lowercase Letter
ValueCountFrequency (%)
o 38326
53.8%
h 32910
46.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 142472
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 38326
26.9%
o 38326
26.9%
C 32910
23.1%
h 32910
23.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 142472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 38326
26.9%
o 38326
26.9%
C 32910
23.1%
h 32910
23.1%

prescribed_diabetes_meds
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.7 KiB

Value | Count
True | 54890
False | 16346

Value | Count | Frequency (%)
True | 54890 | 77.1%
False | 16346 | 22.9%
Distinct: 303
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Length

Max length: 91
Median length: 82
Mean length: 15.408249
Min length: 2

Characters and Unicode

Total characters: 1097622
Distinct characters: 27
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 104
Unique (%): 0.1%

Sample

1st row: []
2nd row: ['insulin']
3rd row: ['glimepiride', 'insulin']
4th row: []
5th row: []
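
Each value in this column is the text form of a Python list rather than a list object, which is why brackets and quotes dominate its character counts. A minimal sketch of recovering the individual drug names with ast.literal_eval, assuming every value is a well-formed list literal like the sample rows:

```python
import ast
from collections import Counter

# The five sample rows shown above, as raw strings.
raw_rows = ["[]", "['insulin']", "['glimepiride', 'insulin']", "[]", "[]"]

# literal_eval safely parses Python literals without executing code.
parsed = [ast.literal_eval(row) for row in raw_rows]

# Count how often each drug appears and how many rows list no drug at all.
drug_counts = Counter(drug for meds in parsed for drug in meds)
n_empty = sum(1 for meds in parsed if not meds)

print(drug_counts, n_empty)
```

In the report, the empty entry occurs 16346 times, matching the 16346 rows where prescribed_diabetes_meds is False.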

Value | Count | Frequency (%)
insulin | 38105 | 38.0%
(empty) | 16346 | 16.3%
metformin | 13965 | 13.9%
glipizide | 8906 | 8.9%
glyburide | 7520 | 7.5%
pioglitazone | 5092 | 5.1%
rosiglitazone | 4473 | 4.5%
glimepiride | 3574 | 3.6%
repaglinide | 1078 | 1.1%
glyburide-metformin | 497 | 0.5%
Other values (12) | 827 | 0.8%

Most occurring characters

ValueCountFrequency (%)
' 168074
15.3%
i 158582
14.4%
n 102289
 
9.3%
[ 71236
 
6.5%
] 71236
 
6.5%
l 69893
 
6.4%
e 51549
 
4.7%
u 46139
 
4.2%
s 42785
 
3.9%
o 34002
 
3.1%
Other values (17) 281837
25.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 728271
66.3%
Other Punctuation 197221
 
18.0%
Open Punctuation 71236
 
6.5%
Close Punctuation 71236
 
6.5%
Space Separator 29147
 
2.7%
Dash Punctuation 511
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 158582
21.8%
n 102289
14.0%
l 69893
9.6%
e 51549
 
7.1%
u 46139
 
6.3%
s 42785
 
5.9%
o 34002
 
4.7%
m 32652
 
4.5%
r 31935
 
4.4%
g 31662
 
4.3%
Other values (11) 126783
17.4%
Other Punctuation
ValueCountFrequency (%)
' 168074
85.2%
, 29147
 
14.8%
Open Punctuation
ValueCountFrequency (%)
[ 71236
100.0%
Close Punctuation
ValueCountFrequency (%)
] 71236
100.0%
Space Separator
ValueCountFrequency (%)
29147
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 511
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 728271
66.3%
Common 369351
33.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 158582
21.8%
n 102289
14.0%
l 69893
9.6%
e 51549
 
7.1%
u 46139
 
6.3%
s 42785
 
5.9%
o 34002
 
4.7%
m 32652
 
4.5%
r 31935
 
4.4%
g 31662
 
4.3%
Other values (11) 126783
17.4%
Common
ValueCountFrequency (%)
' 168074
45.5%
[ 71236
19.3%
] 71236
19.3%
, 29147
 
7.9%
29147
 
7.9%
- 511
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1097622
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 168074
15.3%
i 158582
14.4%
n 102289
 
9.3%
[ 71236
 
6.5%
] 71236
 
6.5%
l 69893
 
6.4%
e 51549
 
4.7%
u 46139
 
4.2%
s 42785
 
3.9%
o 34002
 
3.1%
Other values (17) 281837
25.7%

readmitted_binary
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.7 KiB

Value | Count
False | 63286
True | 7950

Value | Count | Frequency (%)
False | 63286 | 88.8%
True | 7950 | 11.2%

readmitted_multiclass
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 556.7 KiB

Value | Count
No | 38405
>30 days | 24881
<30 days | 7950

Length

Max length: 8
Median length: 2
Mean length: 4.7652591
Min length: 2

Characters and Unicode

Total characters: 339458
Distinct characters: 11
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >30 days
2nd row: No
3rd row: No
4th row: No
5th row: >30 days

Common Values

Value | Count | Frequency (%)
No | 38405 | 53.9%
>30 days | 24881 | 34.9%
<30 days | 7950 | 11.2%
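
The <30 days count (7950) equals the True count of readmitted_binary, which suggests, though the report does not state it, that the binary target flags readmission within 30 days. A minimal sketch of that derivation under this assumption; to_binary is a hypothetical helper:

```python
# Category counts copied from the table above.
multiclass_counts = {"No": 38405, ">30 days": 24881, "<30 days": 7950}


def to_binary(label):
    """Assumed rule: True only for readmission within 30 days."""
    return label == "<30 days"


# Aggregate the multiclass counts into the two binary classes.
binary_counts = {True: 0, False: 0}
for label, count in multiclass_counts.items():
    binary_counts[to_binary(label)] += count

print(binary_counts)
```

Under this rule the totals are 7950 True and 63286 False, the same split reported for readmitted_binary, which is also consistent with the high-correlation alert between the two variables.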

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Value | Count | Frequency (%)
no | 38405 | 36.9%
30 | 32831 | 31.5%
days | 32831 | 31.5%

Most occurring characters

ValueCountFrequency (%)
N 38405
11.3%
o 38405
11.3%
3 32831
9.7%
0 32831
9.7%
32831
9.7%
d 32831
9.7%
a 32831
9.7%
y 32831
9.7%
s 32831
9.7%
> 24881
7.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 169729
50.0%
Decimal Number 65662
 
19.3%
Uppercase Letter 38405
 
11.3%
Space Separator 32831
 
9.7%
Math Symbol 32831
 
9.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 38405
22.6%
d 32831
19.3%
a 32831
19.3%
y 32831
19.3%
s 32831
19.3%
Decimal Number
ValueCountFrequency (%)
3 32831
50.0%
0 32831
50.0%
Math Symbol
ValueCountFrequency (%)
> 24881
75.8%
< 7950
 
24.2%
Uppercase Letter
ValueCountFrequency (%)
N 38405
100.0%
Space Separator
ValueCountFrequency (%)
32831
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 208134
61.3%
Common 131324
38.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 38405
18.5%
o 38405
18.5%
d 32831
15.8%
a 32831
15.8%
y 32831
15.8%
s 32831
15.8%
Common
ValueCountFrequency (%)
3 32831
25.0%
0 32831
25.0%
32831
25.0%
> 24881
18.9%
< 7950
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 339458
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 38405
11.3%
o 38405
11.3%
3 32831
9.7%
0 32831
9.7%
32831
9.7%
d 32831
9.7%
a 32831
9.7%
y 32831
9.7%
s 32831
9.7%
> 24881
7.3%

Interactions


Correlations

variable | encounter_id | patient_id | outpatient_visits_in_previous_year | emergency_visits_in_previous_year | inpatient_visits_in_previous_year | average_pulse_bpm | length_of_stay_in_hospital | number_lab_tests | non_lab_procedures | number_of_medications | number_diagnoses | race | gender | age | weight | payer_code | admission_type | discharge_disposition | admission_source | glucose_test_result | a1c_test_result | change_in_meds_during_hospitalization | prescribed_diabetes_meds | readmitted_binary | readmitted_multiclass
encounter_id | 1.000 | -0.029 | 0.003 | -0.004 | 0.006 | 0.005 | 0.000 | 0.006 | -0.001 | -0.004 | -0.014 | 0.046 | 0.004 | 0.017 | 0.012 | 0.146 | 0.056 | 0.049 | 0.086 | 0.144 | 0.087 | 0.094 | 0.047 | 0.007 | 0.060
patient_id | -0.029 | 1.000 | 0.155 | 0.112 | 0.025 | 0.002 | -0.017 | 0.026 | -0.021 | 0.047 | 0.238 | 0.105 | 0.025 | 0.040 | 0.037 | 0.175 | 0.138 | 0.074 | 0.099 | 0.157 | 0.116 | 0.129 | 0.069 | 0.038 | 0.114
outpatient_visits_in_previous_year | 0.003 | 0.155 | 1.000 | 0.179 | 0.155 | 0.007 | -0.012 | -0.024 | -0.022 | 0.073 | 0.111 | 0.012 | 0.000 | 0.000 | 0.019 | 0.024 | 0.019 | 0.017 | 0.000 | 0.000 | 0.021 | 0.015 | 0.002 | 0.000 | 0.029
emergency_visits_in_previous_year | -0.004 | 0.112 | 0.179 | 1.000 | 0.220 | 0.003 | -0.002 | 0.006 | -0.044 | 0.044 | 0.092 | 0.000 | 0.002 | 0.029 | 0.000 | 0.036 | 0.000 | 0.000 | 0.000 | 0.008 | 0.006 | 0.015 | 0.005 | 0.029 | 0.028
inpatient_visits_in_previous_year | 0.006 | 0.025 | 0.155 | 0.220 | 1.000 | 0.009 | 0.095 | 0.043 | -0.063 | 0.102 | 0.138 | 0.015 | 0.008 | 0.048 | 0.016 | 0.030 | 0.018 | 0.022 | 0.017 | 0.065 | 0.021 | 0.013 | 0.018 | 0.143 | 0.130
average_pulse_bpm | 0.005 | 0.002 | 0.007 | 0.003 | 0.009 | 1.000 | 0.008 | -0.001 | 0.003 | 0.003 | -0.001 | 0.002 | 0.005 | 0.000 | 0.005 | 0.000 | 0.002 | 0.000 | 0.003 | 0.000 | 0.007 | 0.003 | 0.000 | 0.004 | 0.002
length_of_stay_in_hospital | 0.000 | -0.017 | -0.012 | -0.002 | 0.095 | 0.008 | 1.000 | 0.335 | 0.187 | 0.464 | 0.238 | 0.014 | 0.027 | 0.045 | 0.010 | 0.033 | 0.028 | 0.113 | 0.038 | 0.134 | 0.019 | 0.115 | 0.069 | 0.049 | 0.047
number_lab_tests | 0.006 | 0.026 | -0.024 | 0.006 | 0.043 | -0.001 | 0.335 | 1.000 | 0.020 | 0.248 | 0.171 | 0.042 | 0.017 | 0.023 | 0.035 | 0.048 | 0.172 | 0.057 | 0.092 | 0.145 | 0.032 | 0.076 | 0.047 | 0.018 | 0.033
non_lab_procedures | -0.001 | -0.021 | -0.022 | -0.044 | -0.063 | 0.003 | 0.187 | 0.020 | 1.000 | 0.351 | 0.068 | 0.026 | 0.048 | 0.067 | 0.011 | 0.043 | 0.117 | 0.056 | 0.118 | 0.022 | 0.019 | 0.027 | 0.032 | 0.022 | 0.036
number_of_medications | -0.004 | 0.047 | 0.073 | 0.044 | 0.102 | 0.003 | 0.464 | 0.248 | 0.351 | 1.000 | 0.295 | 0.033 | 0.038 | 0.064 | 0.007 | 0.040 | 0.070 | 0.079 | 0.052 | 0.120 | 0.033 | 0.248 | 0.200 | 0.047 | 0.066
number_diagnoses | -0.014 | 0.238 | 0.111 | 0.092 | 0.138 | -0.001 | 0.238 | 0.171 | 0.068 | 0.295 | 1.000 | 0.064 | 0.000 | 0.133 | 0.021 | 0.078 | 0.070 | 0.077 | 0.114 | 0.048 | 0.112 | 0.054 | 0.031 | 0.051 | 0.086
race | 0.046 | 0.105 | 0.012 | 0.000 | 0.015 | 0.002 | 0.014 | 0.042 | 0.026 | 0.033 | 0.064 | 1.000 | 0.056 | 0.085 | 0.035 | 0.086 | 0.058 | 0.045 | 0.092 | 0.000 | 0.066 | 0.017 | 0.025 | 0.017 | 0.038
gender | 0.004 | 0.025 | 0.000 | 0.002 | 0.008 | 0.005 | 0.027 | 0.017 | 0.048 | 0.038 | 0.000 | 0.056 | 1.000 | 0.079 | 0.026 | 0.061 | 0.009 | 0.065 | 0.020 | 0.000 | 0.041 | 0.017 | 0.015 | 0.003 | 0.015
age | 0.017 | 0.040 | 0.000 | 0.029 | 0.048 | 0.000 | 0.045 | 0.023 | 0.067 | 0.064 | 0.133 | 0.085 | 0.079 | 1.000 | 0.029 | 0.154 | 0.044 | 0.130 | 0.044 | 0.123 | 0.184 | 0.054 | 0.046 | 0.032 | 0.038
weight | 0.012 | 0.037 | 0.019 | 0.000 | 0.016 | 0.005 | 0.010 | 0.035 | 0.011 | 0.007 | 0.021 | 0.035 | 0.026 | 0.029 | 1.000 | 0.056 | 0.023 | 0.014 | 0.043 | 1.000 | 0.016 | 0.046 | 0.031 | 0.000 | 0.036
payer_code | 0.146 | 0.175 | 0.024 | 0.036 | 0.030 | 0.000 | 0.033 | 0.048 | 0.043 | 0.040 | 0.078 | 0.086 | 0.061 | 0.154 | 0.056 | 1.000 | 0.086 | 0.072 | 0.073 | 0.101 | 0.154 | 0.150 | 0.097 | 0.032 | 0.050
admission_type | 0.056 | 0.138 | 0.019 | 0.000 | 0.018 | 0.002 | 0.028 | 0.172 | 0.117 | 0.070 | 0.070 | 0.058 | 0.009 | 0.044 | 0.023 | 0.086 | 1.000 | 0.077 | 0.338 | 0.123 | 0.041 | 0.050 | 0.031 | 0.012 | 0.036
discharge_disposition | 0.049 | 0.074 | 0.017 | 0.000 | 0.022 | 0.000 | 0.113 | 0.057 | 0.056 | 0.079 | 0.077 | 0.045 | 0.065 | 0.130 | 0.014 | 0.072 | 0.077 | 1.000 | 0.059 | 0.063 | 0.088 | 0.077 | 0.056 | 0.127 | 0.133
admission_source | 0.086 | 0.099 | 0.000 | 0.000 | 0.017 | 0.003 | 0.038 | 0.092 | 0.118 | 0.052 | 0.114 | 0.092 | 0.020 | 0.044 | 0.043 | 0.073 | 0.338 | 0.059 | 1.000 | 0.113 | 0.022 | 0.048 | 0.033 | 0.018 | 0.076
glucose_test_result | 0.144 | 0.157 | 0.000 | 0.008 | 0.065 | 0.000 | 0.134 | 0.145 | 0.022 | 0.120 | 0.048 | 0.000 | 0.000 | 0.123 | 1.000 | 0.101 | 0.123 | 0.063 | 0.113 | 1.000 | 0.373 | 0.243 | 0.189 | 0.008 | 0.055
a1c_test_result | 0.087 | 0.116 | 0.021 | 0.006 | 0.021 | 0.007 | 0.019 | 0.032 | 0.019 | 0.033 | 0.112 | 0.066 | 0.041 | 0.184 | 0.016 | 0.154 | 0.041 | 0.088 | 0.022 | 0.373 | 1.000 | 0.191 | 0.186 | 0.000 | 0.018
change_in_meds_during_hospitalization | 0.094 | 0.129 | 0.015 | 0.015 | 0.013 | 0.003 | 0.115 | 0.076 | 0.027 | 0.248 | 0.054 | 0.017 | 0.017 | 0.054 | 0.046 | 0.150 | 0.050 | 0.077 | 0.048 | 0.243 | 0.191 | 1.000 | 0.506 | 0.020 | 0.045
prescribed_diabetes_meds | 0.047 | 0.069 | 0.002 | 0.005 | 0.018 | 0.000 | 0.069 | 0.047 | 0.032 | 0.200 | 0.031 | 0.025 | 0.015 | 0.046 | 0.031 | 0.097 | 0.031 | 0.056 | 0.033 | 0.189 | 0.186 | 0.506 | 1.000 | 0.027 | 0.063
readmitted_binary | 0.007 | 0.038 | 0.000 | 0.029 | 0.143 | 0.004 | 0.049 | 0.018 | 0.022 | 0.047 | 0.051 | 0.017 | 0.003 | 0.032 | 0.000 | 0.032 | 0.012 | 0.127 | 0.018 | 0.008 | 0.000 | 0.020 | 0.027 | 1.000 | 1.000
readmitted_multiclass | 0.060 | 0.114 | 0.029 | 0.028 | 0.130 | 0.002 | 0.047 | 0.033 | 0.036 | 0.066 | 0.086 | 0.038 | 0.015 | 0.038 | 0.036 | 0.050 | 0.036 | 0.133 | 0.076 | 0.055 | 0.018 | 0.045 | 0.063 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
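A minimal sketch of how such a nullity correlation could be computed, assuming it is the Pearson correlation between per-column missingness indicators (1 where a value is missing, 0 where it is present). The helper names `pearson` and `nullity_correlation` and the toy rows are hypothetical illustrations, not part of the report or of any library API.

```python
from math import sqrt
from typing import Dict, List, Optional, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Plain Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def nullity_correlation(
    columns: Dict[str, List[Optional[str]]]
) -> Dict[Tuple[str, str], float]:
    """Correlate missingness indicators for every pair of columns."""
    # 1.0 where the value is missing (None), 0.0 where it is present.
    indicators = {
        name: [1.0 if v is None else 0.0 for v in values]
        for name, values in columns.items()
    }
    # A column that is never (or always) missing has zero variance, so its
    # correlation is undefined; such columns are left out.
    varying = [n for n, ind in indicators.items() if len(set(ind)) > 1]
    return {
        (a, b): pearson(indicators[a], indicators[b])
        for i, a in enumerate(varying)
        for b in varying[i + 1:]
    }


# Hypothetical toy rows, not taken from the profiled dataset.
toy = {
    "glucose_test_result": [None, ">200", None, None],
    "a1c_test_result": [None, "Norm", None, ">8"],
    "age": ["[70-80)", "[50-60)", "[60-70)", "[60-70)"],
}
result = nullity_correlation(toy)
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and columns with no variation in missingness (such as `age` in the toy data) are excluded.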

Sample

index | encounter_id | country | patient_id | race | gender | age | weight | payer_code | outpatient_visits_in_previous_year | emergency_visits_in_previous_year | inpatient_visits_in_previous_year | admission_type | medical_specialty | average_pulse_bpm | discharge_disposition | admission_source | length_of_stay_in_hospital number_lab_tests non_lab_procedures number_of_medications primary_diagnosis secondary_diagnosis additional_diagnosis number_diagnoses | glucose_test_result | a1c_test_result | change_in_meds_during_hospitalization | prescribed_diabetes_meds | medication | readmitted_binary | readmitted_multiclass
0 | 533192 | USA | 70110 | Caucasian | Female | [70-80) | ? | ? | 0 | 0 | 2 | Emergency | Family/GeneralPractice | 86 | Discharged to home | Emergency Room | 2480205152764668 | NaN | NaN | No | No | [] | No | >30 days
1 | 426845 | USA | 29775006 | AfricanAmerican | Male | [50-60) | ? | ? | 0 | 0 | 0 | Emergency | ? | 82 | Discharged/transferred to a federal health care facility. | Clinic Referral | 1471025387851629 | NaN | NaN | No | Yes | ['insulin'] | No | No
2 | 110358 | USA | 80729253 | Caucasian | Female | [60-70) | ? | ? | 0 | 0 | 1 | NaN | Family/GeneralPractice | 88 | Discharged to home | NaN | 6601225341352506 | NaN | NaN | Ch | Yes | ['glimepiride', 'insulin'] | No | No
3 | 628963 | USA | 2919042 | AfricanAmerican | Male | [60-70) | ? | MC | 0 | 0 | 1 | Emergency | InternalMedicine | 129 | Discharged to home | Transfer from another health care facility | 648295695624555 | NaN | NaN | No | No | [] | No | No
4 | 130580 | USA | 84871971 | Caucasian | Female | [70-80) | ? | HM | 1 | 0 | 0 | Elective | ? | 121 | Discharged/transferred to home with home health service | Physician Referral | 6471157155994289 | NaN | NaN | No | No | [] | No | >30 days
5 | 269844 | USA | 279288 | Caucasian | Female | [50-60) | ? | ? | 0 | 0 | 0 | Emergency | Surgery-General | 117 | Discharged to home | Emergency Room | 3582105742502443 | NaN | Norm | No | No | [] | No | >30 days
6 | 968846 | USA | 1566405 | Caucasian | Female | [50-60) | ? | UN | 0 | 0 | 0 | Emergency | ? | 108 | Discharged to home | Emergency Room | 159313786250.024939 | NaN | >8 | Ch | Yes | ['metformin', 'glimepiride'] | No | No
7 | 702581 | USA | 60052095 | Other | Male | [70-80) | ? | MC | 0 | 0 | 0 | Elective | Radiologist | 100 | Discharged/transferred to home with home health service | Physician Referral | 6566394142872769 | NaN | NaN | Ch | Yes | ['nateglinide', 'glipizide', 'insulin'] | No | No
8 | 943736 | USA | 85756257 | Caucasian | Female | [50-60) | ? | MC | 1 | 0 | 0 | Urgent | ? | 133 | Discharged to home | Physician Referral | 355016682250.022769 | NaN | NaN | No | Yes | ['insulin'] | No | No
9 | 760073 | USA | 96104214 | Caucasian | Female | [70-80) | ? | MC | 0 | 0 | 0 | Elective | ? | 70 | Discharged to home | Physician Referral | 8405214253987459 | NaN | NaN | No | Yes | ['insulin'] | No | >30 days

index | encounter_id | country | patient_id | race | gender | age | weight | payer_code | outpatient_visits_in_previous_year | emergency_visits_in_previous_year | inpatient_visits_in_previous_year | admission_type | medical_specialty | average_pulse_bpm | discharge_disposition | admission_source | length_of_stay_in_hospital number_lab_tests non_lab_procedures number_of_medications primary_diagnosis secondary_diagnosis additional_diagnosis number_diagnoses | glucose_test_result | a1c_test_result | change_in_meds_during_hospitalization | prescribed_diabetes_meds | medication | readmitted_binary | readmitted_multiclass
71226 | 716633 | USA | 42451245 | Caucasian | Male | [60-70) | ? | MC | 0 | 0 | 1 | Elective | ? | 62 | Discharged to home | Physician Referral | 641014V57V45250.69 | NaN | NaN | Ch | Yes | ['metformin', 'insulin'] | No | >30 days
71227 | 275537 | USA | 23904909 | Caucasian | Female | [80-90) | ? | ? | 0 | 0 | 1 | Not Available | InternalMedicine | 105 | Discharged/transferred to SNF | NaN | 7230205075992508 | >200 | NaN | Ch | Yes | ['insulin'] | No | >30 days
71228 | 368749 | USA | 82494909 | Caucasian | Female | [70-80) | ? | ? | 0 | 1 | 0 | NaN | Orthopedics-Reconstructive | 95 | Discharged/transferred to another rehab fac including rehab units of a hospital . | NaN | 3371208244012509 | NaN | NaN | Ch | Yes | ['glipizide', 'pioglitazone'] | No | No
71229 | 550555 | USA | 102008835 | Caucasian | Male | [70-80) | ? | ? | 0 | 0 | 2 | Urgent | ? | 94 | Discharged to home | Clinic Referral | 12800254284964279 | NaN | NaN | Ch | Yes | ['insulin'] | No | No
71230 | 495864 | USA | 65777877 | Caucasian | Female | [70-80) | ? | MC | 3 | 1 | 1 | Emergency | ? | 106 | Discharged/transferred to SNF | Emergency Room | 4570265849974549 | NaN | NaN | Ch | Yes | ['metformin', 'insulin'] | No | >30 days
71231 | 136517 | USA | 24531381 | AfricanAmerican | Female | [80-90) | ? | MC | 0 | 1 | 0 | Elective | InternalMedicine | 108 | Discharged to home | Physician Referral | 7511184537862509 | NaN | >7 | Ch | Yes | ['metformin', 'glyburide', 'insulin'] | No | No
71232 | 825966 | USA | 4663818 | AfricanAmerican | Female | [70-80) | ? | ? | 0 | 0 | 0 | Urgent | ? | 66 | NaN | Physician Referral | 92009157197V666 | NaN | >7 | Ch | Yes | ['metformin', 'glyburide'] | No | No
71233 | 332023 | USA | 23397147 | Caucasian | Female | [60-70) | ? | ? | 0 | 2 | 2 | NaN | ? | 71 | Not Mapped | NaN | 5250244284912769 | >300 | NaN | Ch | Yes | ['glyburide', 'insulin'] | Yes | <30 days
71234 | 495516 | USA | 52161750 | Caucasian | Male | [60-70) | ? | BC | 0 | 0 | 2 | Emergency | Emergency/Trauma | 98 | Discharged/transferred to SNF | Emergency Room | 234113820157250.85 | NaN | NaN | Ch | Yes | ['glyburide', 'insulin'] | No | No
71235 | 721770 | USA | 88410897 | Caucasian | Male | [70-80) | ? | BC | 0 | 0 | 1 | Urgent | Surgery-General | 108 | Discharged to home | Emergency Room | 542294342504018 | NaN | NaN | Ch | Yes | ['insulin'] | No | >30 days